EXPLAIN clause

Back to DuckDB Data Engineering Glossary

Overview

The EXPLAIN clause is a powerful diagnostic tool that shows how DuckDB plans to execute your SQL query. When you add EXPLAIN before any SQL statement, instead of running the query, DuckDB will display the sequence of operations (known as the query plan) it would use to produce the results.

Basic Usage

The simplest way to use EXPLAIN is to prefix it to your query:

Copy code
EXPLAIN SELECT * FROM customers WHERE country = 'USA';

EXPLAIN ANALYZE

DuckDB also supports EXPLAIN ANALYZE, which both shows the query plan and actually executes the query, providing additional runtime statistics about how long each operation took and how many rows were processed:

Copy code
EXPLAIN ANALYZE 
SELECT customer_name, COUNT(*) 
FROM orders 
GROUP BY customer_name 
HAVING COUNT(*) > 10;

Understanding Query Plans

DuckDB's query plans are displayed as a tree structure, with each operation indented to show its relationship to other operations. The operations are executed from bottom to top, with the results flowing upward through the tree. Key components you'll see include:

SCAN operations that read data from tables
FILTER operations that implement WHERE clauses
JOIN operations that combine tables
GROUP BY operations that aggregate data
PROJECTION operations that select specific columns

For example:

Copy code
EXPLAIN
SELECT r.region_name, COUNT(*)
FROM customers c
JOIN regions r ON c.region_id = r.id
GROUP BY r.region_name;

This will show how DuckDB plans to join the tables, group the results, and count the records for each region.

DuckDB Specifics

Unlike some databases that use complex notation or require special formatting to read query plans, DuckDB presents its plans in a clear, hierarchical text format that's easy to read. DuckDB's query planner is also designed specifically for analytical workloads, so you'll often see operations optimized for processing large amounts of data in parallel using vectorized execution.